Facilitate SIMD-Code-Generation in the Polyhedral Model by Hardware-aware Automatic Code-Transformation
Authors
Abstract
Although Single Instruction Multiple Data (SIMD) units have been available in general-purpose processors since the 1990s, state-of-the-art compilers are often still unable to exploit them fully, i.e., they may fail to achieve the best possible performance. We present a new hardware-aware and adaptive loop tiling approach that is based on polyhedral transformations and explicitly dedicated to improving auto-vectorization. It extends the tiling algorithm implemented within the PluTo framework [4, 5]. In its default setting, PluTo uses static tile sizes and already enables the use of SIMD units, but it is not primarily targeted at optimizing that use. We experimented with different tile sizes and found a strong relationship between their choice, cache size parameters, and performance. Based on this, we designed an adaptive procedure that specifically tiles vectorizable loops with dynamically calculated sizes. The blocking is automatically fitted to the amount of data read per loop iteration, the available SIMD units, and the cache sizes. The adaptive parts are built upon straightforward calculations that are experimentally verified and evaluated. Our results show significant improvements in the number of vectorized instructions, in cache miss rates, and, ultimately, in running times.
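The abstract does not give the exact formula, but a minimal sketch in C of the kind of tile-size calculation it describes could look as follows; the function name, parameters, and rounding rule are assumptions chosen for illustration, not the authors' algorithm:

#include <stddef.h>

/* Hypothetical sketch (not the paper's actual procedure): pick a tile size
 * for a vectorizable loop so that the data touched by one tile fits into a
 * given cache level while the tile stays a multiple of the SIMD width. */
static size_t adaptive_tile_size(size_t cache_bytes,       /* capacity of the targeted cache level */
                                 size_t bytes_per_iter,    /* data read/written per loop iteration */
                                 size_t simd_width_elems)  /* elements held in one SIMD register   */
{
    /* Largest iteration count whose working set still fits in the cache. */
    size_t max_iters = cache_bytes / bytes_per_iter;

    /* Round down to a multiple of the SIMD width so each tile vectorizes
     * without a scalar remainder loop inside the tile. */
    size_t tile = (max_iters / simd_width_elems) * simd_width_elems;

    /* Fall back to one full vector if the per-iteration data is very large. */
    return tile > 0 ? tile : simd_width_elems;
}

For example, with a 32 KiB L1 data cache, 24 bytes of data per iteration (two doubles read, one written), and 4 doubles per AVX register, this sketch yields a tile of 1364 iterations; the calculation used in the paper may differ.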
Similar resources
Automatic Code Generation for SIMD Hardware Accelerators
SIMD hardware accelerators offer an alternative to manycores when energy consumption and performance are critical. For scientific computing, GPGPUs are used in many of the TOP500 computers, but embedded processors also use accelerators. However, such heterogeneous platforms trade ease of development for performance: the application code and the data must be split between the host and the accel...
PIPS Is not (just) Polyhedral Software: Adding GPU Code Generation in PIPS
Parallel and heterogeneous computing are reaching a growing audience thanks to the increased performance brought by ubiquitous manycores and GPUs. However, available programming models, like OpenCL or CUDA, are far from being straightforward to use. As a consequence, several automated or semi-automated approaches have been proposed to automatically generate hardware-level codes from high-level sequenti...
Tobias Christian Grosser, Diploma thesis
Sustained growth in high performance computing and the availability of advanced mobile devices increase the use of computation-intensive applications. To ensure fast execution and, consequently, low power usage, modern hardware provides multi-level caches, multiple cores, SIMD instructions, or even dedicated vector accelerators. Taking advantage of these manually is difficult and often not possible...
Automatic Transformations for Effective Parallel Execution on Intel Many Integrated Core
We demonstrate in this work the potential effectiveness of a source-to-source framework for automatically optimizing a sub-class of affine programs on the Intel Many Integrated Core Architecture. Data locality is achieved through complex and automated loop transformations within the polyhedral framework to enable parallel tiling, and the resulting tiles are processed by an aggressive automatic ...
Efficient Code Generation for Automatic Parallelization and Optimization
Supercompilers look for the best execution order of the statement instances in the most compute-intensive kernels. It has been extensively shown that the polyhedral model provides convenient abstractions to find and perform useful program transformations. Nevertheless, current polyhedral code generation algorithms lack flexibility by addressing mainly unimodular or at least invertibl...
Journal:
Volume / Issue:
Pages: -
Publication year: 2012